Deeper-GXX: Deepening Arbitrary GNNs
Shallow GNNs tend to perform sub-optimally on large-scale graphs or graphs with
missing features. It is therefore necessary to increase the depth (i.e., the
number of layers) of GNNs to capture more latent knowledge from the input data.
On the other hand, adding more layers to GNNs typically decreases their
performance due to, e.g., vanishing gradients and oversmoothing. Existing
methods (e.g., PairNorm and DropEdge) mainly focus on addressing oversmoothing,
but they suffer from drawbacks such as requiring hard-to-acquire knowledge or
introducing large training randomness. In addition, these methods simply
incorporate ResNet to address vanishing gradients. They ignore an important
fact: as more and more layers are stacked in a ResNet architecture, the
information collected from faraway neighbors comes to dominate the information
collected from the 1-hop and 2-hop neighbors, resulting in severe performance
degradation. In this paper, we first examine the ResNet architecture in depth
and analyze why it is not well suited to deeper GNNs. We then propose a new
residual architecture to attenuate the negative impact caused by ResNet. To
address the drawbacks of the existing methods, we introduce a Topology-guided
Graph Contrastive Loss, named TGCL, which utilizes node topological information
and pulls connected node pairs closer via a contrastive-learning regularizer to
obtain discriminative node representations. Combining the new residual
architecture with TGCL, we propose an end-to-end framework named Deeper-GXX
towards deeper GNNs. Extensive experiments on real-world data sets demonstrate
the effectiveness and efficiency of Deeper-GXX compared with state-of-the-art
baselines.
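A contrastive regularizer of the kind TGCL describes, which treats connected node pairs as positives and pulls them together, can be sketched as follows. This is a minimal, framework-agnostic illustration (NumPy, an InfoNCE-style objective with uniformly sampled negatives), not the paper's exact formulation:

```python
import numpy as np

def topology_contrastive_loss(z, edges, tau=0.5, num_neg=5, rng=None):
    """Illustrative topology-guided contrastive loss: connected node
    pairs are positives, random node pairs are negatives."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-norm embeddings
    src, dst = edges                                   # positive (linked) pairs
    pos = np.sum(z[src] * z[dst], axis=1) / tau        # similarity of neighbors
    neg_idx = rng.integers(0, z.shape[0], size=(len(src), num_neg))
    neg = np.einsum('id,ijd->ij', z[src], z[neg_idx]) / tau
    logits = np.concatenate([pos[:, None], neg], axis=1)
    # InfoNCE: cross-entropy with the positive pair in column 0
    logsumexp = np.log(np.exp(logits).sum(axis=1))
    return float(np.mean(logsumexp - logits[:, 0]))
```

Minimizing this loss increases the similarity of each linked pair relative to random pairs, which is the "pull connected pairs closer" effect the abstract describes.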
FairGen: Towards Fair Graph Generation
There have been tremendous efforts over the past decades dedicated to the
generation of realistic graphs in a variety of domains, ranging from social
networks to computer networks, from gene regulatory networks to online
transaction networks. Despite the remarkable success, the vast majority of
these works are unsupervised in nature and are typically trained to minimize
the expected graph reconstruction loss, which would result in the
representation disparity issue in the generated graphs, i.e., the protected
groups (often minorities) contribute less to the objective and thus suffer from
systematically higher errors. In this paper, we aim to tailor graph generation
to downstream mining tasks by leveraging label information and user-preferred
parity constraint. In particular, we start from the investigation of
representation disparity in the context of graph generative models. To mitigate
the disparity, we propose a fairness-aware graph generative model named
FairGen. Our model jointly trains a label-informed graph generation module and
a fair representation learning module by progressively learning the behaviors
of the protected and unprotected groups, from the `easy' concepts to the `hard'
ones. In addition, we propose a generic context sampling strategy for graph
generative models, which is proven to be capable of fairly capturing the
contextual information of each group with a high probability. Experimental
results on seven real-world data sets, including web-based graphs, demonstrate
that FairGen (1) obtains performance on par with state-of-the-art graph
generative models across six network properties, (2) mitigates the
representation disparity issues in the generated graphs, and (3) substantially
boosts the model performance by up to 17% in downstream tasks via data
augmentation.
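The representation-disparity failure mode described above can be seen in miniature with a group-balanced loss: averaging the loss within each group before averaging across groups keeps a small protected group from being drowned out by the majority. This is a generic illustration of the problem FairGen targets, not FairGen's actual training objective:

```python
import numpy as np

def group_balanced_loss(per_item_loss, group_ids):
    """Mean of per-group means: each group contributes equally to the
    objective regardless of its size, unlike a plain expected loss."""
    groups = np.unique(group_ids)
    return float(np.mean([per_item_loss[group_ids == g].mean()
                          for g in groups]))
```

For example, with per-item losses [1, 1, 1, 5] where the last item forms a protected minority group, the plain mean is 2.0 while the group-balanced loss is 3.0, so the minority's systematically higher error is no longer hidden by the majority's low error.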
Self-planning Code Generation with Large Language Models
Although large language models have demonstrated impressive ability in code
generation, they are still struggling to address the complicated intent
provided by humans. It is widely acknowledged that humans typically employ
planning to decompose complex problems and schedule the solution steps prior to
implementation. Thus we introduce planning into code generation to help the
model understand complex intent and reduce the difficulty of problem solving.
This paper proposes a self-planning code generation method with large language
models, which consists of two phases, namely a planning phase and an
implementation phase. Specifically, in the planning phase, the language model plans out the
solution steps from the intent combined with in-context learning. Then it
enters the implementation phase, where the model generates code step by step,
guided by the solution steps. The effectiveness of self-planning code
generation has been rigorously evaluated on multiple code generation datasets,
and the results demonstrate a marked superiority over naive direct generation
with language models. The improvement in performance is substantial,
highlighting the significance of self-planning in code generation tasks.
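The two-phase procedure can be sketched as a pair of prompts. Here `llm` stands for any text-completion callable and the prompt wording is illustrative, not the paper's exact templates:

```python
def self_planning_codegen(intent, llm, plan_examples):
    """Two-phase generation sketch: (1) prompt the model, with few-shot
    plan examples for in-context learning, to decompose the intent into
    solution steps; (2) prompt again to implement code guided by those
    steps. All names and prompt formats here are illustrative."""
    # Planning phase: elicit solution steps from the intent.
    plan_prompt = plan_examples + f"\nIntent: {intent}\nPlan:"
    plan = llm(plan_prompt)
    # Implementation phase: generate code step by step, guided by the plan.
    impl_prompt = f"Intent: {intent}\nPlan:\n{plan}\nCode:"
    return llm(impl_prompt)
```

The design point is that the model never has to bridge a complex intent and final code in one jump: the plan serves as intermediate scaffolding that conditions the implementation phase.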
Outlier Impact Characterization for Time Series Data
For time series data, certain types of outliers are intrinsically more harmful for parameter estimation and future predictions than others, irrespective of their frequency. In this paper, for the first time, we study the characteristics of such outliers through the lens of the influence functional from robust statistics. In particular, we consider the input time series as a contaminated process, with the recurring outliers generated from an unknown contaminating process. Then we leverage the influence functional to understand the impact of the contaminating process on parameter estimation. The influence functional results in a multi-dimensional vector that measures the sensitivity of the predictive model to the contaminating process, which can be challenging to interpret, especially for models with a large number of parameters. To this end, we further propose a comprehensive single-valued metric (the SIF) to measure outlier impacts on future predictions. It provides a quantitative measure of outlier impact that can be used in a variety of scenarios, such as the evaluation of outlier detection methods, the creation of more harmful outliers, etc. The empirical results on multiple real data sets demonstrate the effectiveness of the proposed SIF metric.
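A crude finite-difference analogue of this idea for an AR(1) model: contaminate selected points and measure the shift in the estimated parameter. This is a simplified stand-in to convey the intuition of sensitivity to a contaminating process, not the influence-functional derivation or the SIF itself:

```python
import numpy as np

def ar1_fit(x):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise."""
    return float(np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1]))

def outlier_impact(x, outlier_idx, magnitude=5.0):
    """Shift in the estimated AR coefficient when the points at
    outlier_idx are contaminated by an additive spike (a crude,
    illustrative proxy for influence on parameter estimation)."""
    x_contaminated = x.copy()
    x_contaminated[outlier_idx] += magnitude
    return abs(ar1_fit(x_contaminated) - ar1_fit(x))
```

Outliers at different positions (e.g., isolated spikes versus consecutive runs) can yield very different impact values even at the same frequency, which is exactly the asymmetry a single-valued impact metric is meant to quantify.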
Coupling of polyhydroxybutyrate and zero-valent iron for enhanced treatment of nitrate pollution within the Permeable Reactive Barrier and its downgradient aquifer
Permeable Reactive Barriers (PRBs) have been utilized to mitigate nitrate pollution in groundwater systems through solid carbon and iron fillers that release diverse nutrients to enhance denitrification efficiency. We conduct laboratory column tests to evaluate the effectiveness of PRBs in remediating nitrate pollution both within the PRB and in the downgradient aquifer. We use an iron-carbon hydrogel (ICH) as the PRB filler, with different weight ratios of polyhydroxybutyrate (PHB) to microscale zero-valent iron (mZVI). Results reveal that denitrification in the downgradient aquifer accounts for at least 19.5% to 32.5% of the total nitrate removal. In the ICH, a higher ratio of PHB to mZVI leads to a larger contribution of the downgradient aquifer to nitrate removal, while a lower ratio results in a smaller contribution. Microbial community analysis further reveals that heterotrophic and mixotrophic bacteria dominate in the downgradient aquifer of the PRB, and their relative abundance increases with a higher ratio of PHB to mZVI in the ICH. Within the PRB, autotrophic and iron-reducing bacteria are more prevalent, and their abundance increases as the ratio of PHB to mZVI in the ICH decreases. These findings emphasize the downgradient aquifer's substantial role in nitrate removal, particularly driven by the dissolved organic carbon provided by PHB. This research holds significant implications for nutrient waste management, including the prevention of secondary pollution and the development of cost-effective PRBs. 24-month embargo: first published 26 December 2023. This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries.
Analysis of the ASMT Gene Family in Pepper (Capsicum annuum L.): Identification, Phylogeny, and Expression Profiles
Acetylserotonin methyltransferase (ASMT), one of the most important enzymes in melatonin biosynthesis in plant species, plays a rate-limiting role in melatonin production. In this study, based on the whole-genome sequence, we performed a systematic analysis of the ASMT gene family in pepper (Capsicum annuum L.) and analyzed their expression profiles during growth and development, as well as under abiotic stresses. The results showed that at least 16 CaASMT genes are present in the pepper genome. Phylogenetic analysis divided all the CaASMTs into three groups (group I, group II, and group III) with high bootstrap support. Using the online MEME tool, six distinct motifs (motif 1 to motif 6) were identified. Chromosome mapping showed that most CaASMT genes are located at the distal ends of the pepper chromosomes. In addition, RNA-seq analysis revealed that the differences in abundance and the distinct expression patterns of these CaASMT genes during vegetative and reproductive development suggest different functions. The qRT-PCR analysis showed high abundance of CaASMT03, CaASMT04, and CaASMT06 in mature green fruit and mature red fruit. Finally, using RNA-seq and qRT-PCR, we also found that several CaASMT genes were induced under abiotic stress conditions. These results will not only contribute to elucidating the evolutionary relationships of ASMT genes but also help ascertain their biological functions in the pepper plant's response to abiotic stresses.
Scalable production of few-layer niobium disulfide nanosheets via electrochemical exfoliation for energy-efficient hydrogen evolution reaction
Two-dimensional (2D) niobium disulfide (NbS2) materials feature unique physical and chemical properties that make them highly promising for energy conversion applications. Herein, we developed a robust synthesis technique, consisting of electrochemical exfoliation under alternating currents followed by liquid-phase exfoliation, to prepare highly uniform few-layer NbS2 nanosheets. The obtained few-layer NbS2 material has a 2D nanosheet structure with an ultrathin thickness of ∼3 nm and a lateral size of ∼2 μm. Benefiting from their unique 2D structure and highly exposed active sites, the few-layer NbS2 nanosheets drop-casted on carbon paper exhibited excellent catalytic activity for the hydrogen evolution reaction (HER) in acid, with an overpotential of 90 mV at a current density of 10 mA cm–2 and a low Tafel slope of 83 mV dec–1, which are superior to those reported for other NbS2-based HER electrocatalysts. Furthermore, the few-layer NbS2 nanosheets are effective as bifunctional electrocatalysts for hydrogen production by overall water splitting, where the urea and hydrazine oxidation reactions replace the oxygen evolution reaction.